Mining Correlated Patterns with Multiple Minimum All-Confidence Thresholds
Abstract
Correlated patterns are an important class of regularities that exist in a database. The all-confidence measure has been widely used to discover such patterns in real-world applications. This paper theoretically analyzes the all-confidence measure and shows that, although the measure satisfies the null-invariant property, mining correlated patterns that involve both frequent and rare items with a single minimum all-confidence (minAllConf) threshold causes the "rare item problem" when the items' frequencies in a database vary widely. The problem takes one of two forms: at a high minAllConf threshold, only very short correlated patterns involving rare items are found; at a low minAllConf threshold, a huge number of patterns is generated. The cause is that a single minAllConf threshold is insufficient to capture the items' frequencies in a database effectively. The paper also introduces an alternative model of correlated patterns based on multiple minAllConf thresholds. The proposed model lets the user specify a different minAllConf threshold for each pattern, reflecting the varied frequencies of the items within it. Experimental results show that the proposed model is very effective.
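The trade-off described in the abstract can be illustrated with a minimal Python sketch. It uses the standard definition of all-confidence (the support of a pattern divided by the largest support of any single item in it). The toy transactions, the helper names (`support`, `all_confidence`, `correlated_pairs`), the per-item values in `item_threshold`, and above all the rule in `per_pattern_threshold` (taking the largest per-item value in the pattern) are assumptions made for illustration only; they are not the data or the threshold rule of the paper.

```python
from itertools import combinations

# Toy database (illustrative only): "bread" and "jam" are frequent,
# "bed" and "pillow" are rare.
transactions = (
    [{"bread", "jam"}] * 6
    + [{"bread"}] * 2
    + [{"bread", "bed"}]
    + [{"bed", "pillow"}]
)

def support(pattern, transactions):
    """Number of transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t)

def all_confidence(pattern, transactions):
    """Support of the pattern divided by the largest single-item support in it."""
    max_item_sup = max(support(frozenset([i]), transactions) for i in pattern)
    return support(pattern, transactions) / max_item_sup if max_item_sup else 0.0

def correlated_pairs(transactions, threshold_for):
    """Return the 2-item patterns whose all-confidence meets threshold_for(pattern)."""
    items = sorted(set().union(*transactions))
    result = {}
    for pair in combinations(items, 2):
        pattern = frozenset(pair)
        score = all_confidence(pattern, transactions)
        if score > 0 and score >= threshold_for(pattern):
            result[pair] = round(score, 3)
    return result

# Single threshold, set high: the rare pattern {bed, pillow} is missed.
single_high = correlated_pairs(transactions, lambda p: 0.6)

# Single threshold, set low: the weak pattern {bed, bread} is also reported.
single_low = correlated_pairs(transactions, lambda p: 0.1)

# Hypothetical per-pattern threshold: each item carries its own value and the
# pattern uses the largest one. This rule is an assumption, not the paper's.
item_threshold = {"bread": 0.6, "jam": 0.6, "bed": 0.4, "pillow": 0.4}

def per_pattern_threshold(pattern):
    return max(item_threshold[i] for i in pattern)

per_pattern = correlated_pairs(transactions, per_pattern_threshold)

print(single_high)
print(single_low)
print(per_pattern)
```

In this toy database, the high single threshold keeps only the frequent pair, the low single threshold also admits the pair that joins a frequent item with a rare one, and the per-pattern variant keeps both the frequent pair and the rare pair while rejecting the weak one.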
Similar resources
High-Utility Sequential Pattern Mining with Multiple Minimum Utility Thresholds
High-utility sequential pattern mining is an emerging topic in recent decades and most algorithms were designed to identify the complete set of high-utility sequential patterns under the single minimum utility threshold. In this paper, we first propose a novel framework called high-utility sequential pattern mining with multiple minimum utility thresholds to mine high utility sequential pattern...
Frequent Pattern Mining under Multiple Support Thresholds
Traditional methods use a single minimum support threshold to find the complete set of frequent patterns. However, in real-world applications, a single minimum item support threshold is not adequate, since it does not reflect the nature of each item. If the single minimum support threshold is set too low, a huge number of patterns is generated, including uninteresting ones. On the other h...
A Survey on Association Rule Mining Using Apriori Based Algorithm and Hash Based Methods
Association rule mining is the most important technique in the field of data mining. The main task of association rule mining is to mine association rules by using minimum support thresholds decided by the user, to find the frequent patterns. Above all, most important is research on increment association rules mining. The Apriori algorithm is a classical algorithm in mining association rules. T...
TFP-growth: An Efficient Algorithm for Mining Frequent Patterns without any Thresholds
Conventional frequent pattern mining algorithms require some user-specified minimum support, and then mine frequent patterns with support values that are higher than the minimum support. As it is difficult to predict how many frequent patterns will be mined with a specified minimum support, the Top-k mining concept has been proposed. The Top-k Mining concept is based on an algorithm for mining ...
Efficient Mining of High Average-Utility Itemsets with Multiple Minimum Thresholds
High average-utility itemsets mining (HAUIM) is a key data mining task, which aims at discovering high average-utility itemsets (HAUIs) by taking itemset length into account in transactional databases. Most of these algorithms only consider a single minimum utility threshold for identifying the HAUIs. In this paper, we address this issue by introducing the task of mining HAUIs with multiple min...